The data for the current study was sourced from the Atlas of Living Australia website, an open data source supported by the Australian Government and hosted by the Commonwealth Scientific and Industrial Research Organisation (CSIRO). The data for occurrences of wedge-tailed eagles (Aquila (Uroaetus) audax) across Australia can be accessed through this link to the dataset.
The Atlas of Living Australia is an open data repository for information on the biodiversity of Australia. It is supported by the Australian Government through the National Collaborative Research Infrastructure Strategy (NCRIS) and is hosted by the Commonwealth Scientific and Industrial Research Organisation (CSIRO).
The dataset is publicly available under a CC-BY attribution license. This license allows us to reuse, replicate, tweak and reproduce the data as long as we cite the original source of the data. Further details of the license can be found through the link to licensing details.
The process for obtaining the dataset and storing it in an R data (.rda) file is as follows:
The dataset is publicly available on the Atlas of Living Australia (ALA) website.
In the search bar, we type in “Wedge tailed eagle”, which returns the available data along with the scientific name of the bird. The keyword for obtaining the data is the scientific name of the wedge-tailed eagle, which is “Aquila (Uroaetus) audax”. The step is shown in the image below for reference. The red box shows the keyword entered in the search bar, while the red arrow shows the link to the dataset used for the current study.
Once the galah package has been installed and set up in the RStudio environment, the following code is run to obtain the required dataframe. To limit the size of the dataframe, the following filters are applied so that it holds only what is needed to answer the questions outlined by the task.
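library(galah)
library(dplyr) # Provides %>%, rename(), mutate(), filter() and select()

galah_config(email = "abar0090@student.monash.edu",
             download_reason_id = 10,
             verbose = TRUE)

# Downloading all occurrence records for the wedge-tailed eagle.
# ala_occurrences() and select_taxa() are the names used by early galah releases;
# later releases rename them (atlas_occurrences(), galah_identify()), so adjust to the installed version.
eagles <- ala_occurrences(
  taxa = select_taxa("Aquila (Uroaetus) audax"))

eagles <- eagles %>%
  rename(Longitude = decimalLongitude,
         Latitude = decimalLatitude) %>%
  mutate(eventDate = as.Date(eventDate)) %>%
  filter(!is.na(eventDate), !is.na(Longitude), !is.na(Latitude)) %>% # Dropping records with a missing date or coordinate
  filter(eventDate >= as.Date("2000-01-01")) %>% # Filtering data from 1st Jan 2000 to latest data
  dplyr::select(Latitude, Longitude, recordID, eventDate, dataResourceName, occurrenceStatus) # Selecting relevant variables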
head(eagles)
After the data has been queried successfully and obtained as a dataframe, it is saved into the R data file ‘eagles.rda’ using the save function, as shown in the R code chunk below with the appropriate comments.
save(eagles, file = here::here("data/eagles.rda")) # Saving dataframe into eagles.rda
Since the dataset is considerably large, it is important to create a subset of the data that gives us the highest information gain without making the analysis messy or complex.
It was further decided to obtain the wedge-tailed eagle sightings for all the locations within a radius of roughly 100 kilometers of the Adelaide and Longreach airports. For this purpose, a resource for calculating the distance between two pairs of latitude and longitude was used. The distance calculator can be accessed here. The interface of the distance calculator is shown in the image below.
Using the above calculator, a set of 6 latitude and longitude pairs at a distance of approximately 100 km from the Adelaide and Longreach airports was obtained. The image below shows the 6 geolocation points around the Adelaide airport for reference.
From the above image, it can be observed that the locations within roughly 100 km of each airport can be captured by taking a range of +/- 1 degree around the latitude and longitude of the airport. Hence, we can apply the filter function of the dplyr library to create the subset of locations for which the presence of the eagles is to be assessed.
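As a rough check on the +/- 1 degree approximation, the distances can also be computed directly in R instead of with an online calculator. The sketch below is not the calculator used in this study: it uses the haversine formula with an assumed mean Earth radius of 6371 km, the function name haversine_km is made up for illustration, and the four sightings are invented. The centre point is the midpoint of the Adelaide ranges used in the filter code.

```r
# Great-circle (haversine) distance in km between points given in decimal degrees.
# earth_radius_km = 6371 is an assumed mean Earth radius, not a value from the study.
haversine_km <- function(lat1, lon1, lat2, lon2, earth_radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * earth_radius_km * asin(pmin(1, sqrt(a)))
}

# Midpoint of the Adelaide latitude and longitude ranges
ade_lat <- -34.9285
ade_lon <- 138.5

# Distance covered by a 1 degree step north and a 1 degree step east of the centre
km_per_deg_lat <- haversine_km(ade_lat, ade_lon, ade_lat + 1, ade_lon)
km_per_deg_lon <- haversine_km(ade_lat, ade_lon, ade_lat, ade_lon + 1)

# Invented sightings, only to illustrate filtering by distance instead of by box
toy <- data.frame(
  Latitude  = c(-34.95, -34.20, -35.90, -33.00),
  Longitude = c(138.55, 139.40, 137.50, 138.50)
)
toy$dist_km <- haversine_km(ade_lat, ade_lon, toy$Latitude, toy$Longitude)
toy_within_100km <- toy[toy$dist_km <= 100, ]
```

Under these assumptions, one degree of latitude is about 111 km and one degree of longitude at Adelaide's latitude is about 91 km. The +/- 1 degree box is therefore only an approximation of a 100 km radius, and points near its corners lie farther than 100 km from the centre.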
# Adelaide airport
eagles_ade <- eagles %>%
  filter(Latitude >= -35.9285, Latitude <= -33.9285) %>% # Filtering latitudes
  filter(Longitude >= 137.5, Longitude <= 139.5) # Filtering longitudes
head(eagles_ade)

# Longreach airport
eagles_lon <- eagles %>%
  filter(Latitude >= -24.4403, Latitude <= -22.4403) %>% # Filtering latitudes
  filter(Longitude >= 143.2506, Longitude <= 145.2506) # Filtering longitudes
head(eagles_lon)
The data is in the file eagles.rda in the data directory. The variables it contains are described below.
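One way to confirm which variables a saved file contains is to load it and inspect the restored object. The sketch below is illustrative only: it writes a small invented dataframe (eagles_demo, a hypothetical name) to a temporary file rather than touching the real file in the data directory, and the occurrenceStatus values shown are assumed.

```r
# Invented stand-in for the eagles dataframe, with the same column names
eagles_demo <- data.frame(
  Latitude = c(-34.9, -23.4),
  Longitude = c(138.5, 144.3),
  recordID = c("a1", "a2"),
  eventDate = as.Date(c("2010-05-01", "2019-01-15")),
  dataResourceName = c("Source A", "Source B"),
  occurrenceStatus = c("PRESENT", "PRESENT"),
  stringsAsFactors = FALSE
)

# Saving to a temporary .rda file instead of the data directory
rda_path <- tempfile(fileext = ".rda")
save(eagles_demo, file = rda_path)

# load() restores objects under the names they were saved with;
# loading into a fresh environment avoids overwriting anything in the workspace
loaded <- new.env()
restored_names <- load(rda_path, envir = loaded)

# Column names and types of the restored dataframe
variable_names <- names(loaded$eagles_demo)
str(loaded$eagles_demo)
```

Because load() returns the names of the restored objects, restored_names shows what the file holds without needing to know the object name in advance.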
The current dataset contains all the occurrences of wedge-tailed eagles across Australia with latitude and longitude values from 1st January 2000 to the latest updated data records.
The dataset obtained from the ALA website provides all the records of presence (or absence) of the wedge-tailed eagle for the corresponding latitude and longitude. This dataset of spatial variables helps us understand the population density of the bird in and around the airports of Adelaide and Longreach. Since we are required to assess the chances of bird strikes around these airports, we filter the dataset so that it holds the records of the wedge-tailed eagle within approximately 100 km of these airports. As the data for the filtered locations covers a considerable timeline, the filtered sample is expected to be representative of the entire population obtained from the ALA website.
The reasons for selecting the variables in the current dataset are as follows:
Latitude : This is one of the spatial variables that help us pinpoint the geographical location where a record of the presence (or absence) of the wedge-tailed eagle was observed.
Longitude : This is the other spatial variable that helps us pinpoint the geographical location where a record of the presence (or absence) of the wedge-tailed eagle was observed.
recordID : Helps us identify the number of unique observations of birds reported for a particular location and event date.
eventDate : This is the date on which the record was obtained. A temporal analysis can be performed using this variable, as it helps us understand whether the presence of the bird has risen or dropped over the years in the locations close to the airports of Adelaide and Longreach.
dataResourceName : This variable helps us understand the source of the data. This could be an important factor in our analysis, as the Atlas of Living Australia conducts regular data quality checks. If at any point the license of a data resource provider is revoked due to quality concerns, we can identify which observations to filter out from the data at a later point. ALA’s data quality project can be referred to through the link here.
occurrenceStatus : Whether the record obtained for the particular geographical location on a given date observed the presence or absence of the wedge-tailed eagle.
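The temporal analysis mentioned for eventDate could start from a count of presence records per year. The following is a minimal sketch on invented records, not results from the study. It assumes that occurrenceStatus takes the values "PRESENT" and "ABSENT", which should be checked against the downloaded data.

```r
# Invented records with the same column names as the eagles dataframe
toy <- data.frame(
  eventDate = as.Date(c("2001-03-14", "2001-11-02", "2015-06-30",
                        "2015-07-01", "2015-12-25", "2020-01-09")),
  occurrenceStatus = c("PRESENT", "PRESENT", "PRESENT",
                       "ABSENT", "PRESENT", "PRESENT"),
  stringsAsFactors = FALSE
)

# Extracting the calendar year from eventDate
toy$year <- as.integer(format(toy$eventDate, "%Y"))

# Counting records of presence per year (assumed status value "PRESENT")
present <- toy[toy$occurrenceStatus == "PRESENT", ]
records_per_year <- table(present$year)
```

The same counts can be computed separately for the Adelaide and Longreach subsets to compare how sightings near each airport change over the years.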
The dataset here is observational data and, in particular, occurrence data. Some of the limitations that are prevalent in such datasets are as follows:
Since the data obtained from ALA is observational, there could be instances of missing values. These have been filtered out while retrieving the data from the repository.
Since the data is obtained from various sources, there could be inconsistencies in how the data is reported.
While the data collection was done in an extensive and granular manner, the latitude and longitude recorded for an observation may lack precision.
The dataset obtained from the Atlas of Living Australia only reports the wedge-tailed eagles that have been recorded. Hence, it does not cover the entire population of these eagles across the selected regions of Australia.
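Some of the concerns raised in this section, such as missing values and differences between data sources, can be inspected before any records are dropped. The sketch below uses an invented dataframe with the same column names as the eagles data; the source names and values are made up for illustration.

```r
# Invented records with deliberate gaps and one repeated recordID
toy <- data.frame(
  recordID = c("a1", "a2", "a2", "a3", "a4"),
  Latitude = c(-34.9, NA, NA, -23.4, -23.5),
  Longitude = c(138.5, 138.6, 138.6, NA, 144.3),
  eventDate = as.Date(c("2010-05-01", "2011-06-02", "2011-06-02", NA, "2019-01-15")),
  dataResourceName = c("Source A", "Source B", "Source B", "Source A", "Source C"),
  stringsAsFactors = FALSE
)

# Number of missing values in each column
na_per_column <- colSums(is.na(toy))

# Number of rows whose recordID repeats an earlier row
n_duplicated_ids <- sum(duplicated(toy$recordID))

# Number of records contributed by each data resource
records_per_source <- table(toy$dataResourceName)

# Rows that have a date and both coordinates
complete_rows <- toy[complete.cases(toy[, c("Latitude", "Longitude", "eventDate")]), ]
```

Comparing nrow(toy) with nrow(complete_rows) shows how many records the missing-value filters remove, and records_per_source shows which data resources contribute most of the records.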